72 research outputs found

    Very fast optimal bandwidth selection for univariate kernel density estimation

    Most automatic bandwidth selection procedures for kernel density estimates require estimation of quantities involving the density derivatives. Estimation of modes and inflexion points of densities also requires derivative estimates. The computational complexity of evaluating the density derivative at M evaluation points given N sample points from the density is O(MN). In this paper we propose a computationally efficient ϵ-exact approximation algorithm for univariate, Gaussian kernel based density derivative estimation that reduces the computational complexity from O(MN) to linear order, O(N+M). The constant depends on the desired arbitrary accuracy, ϵ. We apply the density derivative evaluation procedure to estimate the optimal bandwidth for kernel density estimation, a process that is often intractable for large data sets. For example, for N = M = 409,600 points, the direct evaluation of the density derivative takes around 12.76 hours, while the fast evaluation requires only 65 seconds with an error of around 10^{-12}. Algorithm details, error bounds, a procedure to choose the parameters, and numerical experiments are presented. We demonstrate the speedup achieved on bandwidth selection using the "solve-the-equation plug-in method" [18]. We also demonstrate that the proposed procedure can be extremely useful for speeding up exploratory projection pursuit techniques.
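
    To make the O(MN) baseline in the abstract above concrete, the following is a minimal sketch of the direct (slow) evaluation of the r-th derivative of a univariate Gaussian kernel density estimate. It is not the paper's fast algorithm. It assumes the standard form p^(r)(y) = (-1)^r / (sqrt(2π) N h^(r+1)) Σ_i He_r((y - x_i)/h) exp(-(y - x_i)^2 / (2h^2)), where He_r is the probabilists' Hermite polynomial; the function names `hermite` and `kde_derivative_direct` are hypothetical.

```python
import math


def hermite(r, u):
    """Probabilists' Hermite polynomial He_r(u), computed with the
    recurrence He_{k+1}(u) = u * He_k(u) - k * He_{k-1}(u)."""
    h_prev, h_curr = 1.0, u
    if r == 0:
        return h_prev
    for k in range(1, r):
        h_prev, h_curr = h_curr, u * h_curr - k * h_prev
    return h_curr


def kde_derivative_direct(samples, targets, h, r=0):
    """Directly evaluate the r-th derivative of a Gaussian KDE.

    samples: the N sample points x_i
    targets: the M evaluation points y_j
    h:       bandwidth (assumed > 0)
    r:       derivative order (r = 0 gives the density estimate itself)

    The double loop touches every (sample, target) pair, hence O(MN).
    """
    n = len(samples)
    norm = (-1) ** r / (math.sqrt(2.0 * math.pi) * n * h ** (r + 1))
    out = []
    for y in targets:
        total = 0.0
        for x in samples:
            u = (y - x) / h
            total += hermite(r, u) * math.exp(-0.5 * u * u)
        out.append(norm * total)
    return out


# Illustrative call on made-up data: second derivative at three points.
second_derivative = kde_derivative_direct([0.1, 0.4, 0.9], [0.0, 0.5, 1.0], h=0.3, r=2)
```

    Doubling both N and M quadruples the work in this sketch, which is the scaling that the fast ϵ-exact method described in the abstract is designed to avoid.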

    POSITION CALIBRATION OF ACOUSTIC SENSORS AND ACTUATORS ON DISTRIBUTED GENERAL PURPOSE COMPUTING PLATFORMS

    An algorithm is presented to automatically determine the relative 3D positions of audio sensors and actuators in an ad-hoc distributed network of heterogeneous general purpose computing platforms. A closed form approximate solution is derived, which is further refined by minimizing a non-linear error function. Our formulation and solution account for the lack of temporal synchronization among different platforms. We also derive an approximate expression for the mean and covariance of the implicitly defined estimator. The theoretical performance limits for the sensor positions are derived and analyzed with respect to the number of sensors and actuators as well as their geometry. We report extensive simulation results and discuss the practical details of implementing our algorithms.

    Joint Learning of Correlated Sequence Labelling Tasks Using Bidirectional Recurrent Neural Networks

    The stream of words produced by Automatic Speech Recognition (ASR) systems is typically devoid of punctuation and formatting. Most natural language processing applications expect segmented and well-formatted text as input, which is not available in ASR output. This paper proposes a novel technique of jointly modeling multiple correlated tasks, such as punctuation and capitalization, using bidirectional recurrent neural networks, which leads to improved performance for each of these tasks. This method could be extended to joint modeling of any other correlated sequence labeling tasks. Comment: Accepted in Interspeech 201

    DeepSolarEye: Power Loss Prediction and Weakly Supervised Soiling Localization via Fully Convolutional Networks for Solar Panels

    The impact of soiling on solar panels is an important and well-studied problem in the renewable energy sector. In this paper, we present the first convolutional neural network (CNN) based approach for solar panel soiling and defect analysis. Our approach takes an RGB image of a solar panel and environmental factors as inputs to predict power loss, soiling localization, and soiling type. In computer vision, localization is a complex task that typically requires manually labeled training data such as bounding boxes or segmentation masks. Our proposed approach consists of four specialized stages that completely avoid localization ground truth and need only panel images with power loss labels for training. The impact regions obtained from the predicted localization masks are classified into soiling types using webly supervised learning. To improve the localization capabilities of CNNs, we introduce a novel bi-directional input-aware fusion (BiDIAF) block that reinforces the input at different levels of the CNN to learn input-specific feature maps. Our empirical study shows that BiDIAF improves the power loss prediction accuracy by about 3% and localization accuracy by about 4%. Our end-to-end model yields a further improvement of about 24% on localization when learned in a weakly supervised manner. Our approach is generalizable and showed promising results on web-crawled solar panel images. Our system has a frame rate of 22 fps (including all steps) on an NVIDIA TitanX GPU. Additionally, we collected a first-of-its-kind dataset for solar panel image analysis consisting of 45,000+ images. Comment: Accepted for publication at WACV 201

    Fast Computation of Sums of Gaussians in High Dimensions

    Evaluating sums of multivariate Gaussian kernels is a key computational task in many problems in computational statistics and machine learning. The computational cost of the direct evaluation of such sums scales as the product of the number of kernel functions and the number of evaluation points. The fast Gauss transform proposed by Greengard and Strain (1991) is an ϵ-exact approximation algorithm that reduces the computational complexity of evaluating the sum of N Gaussians at M points in d dimensions from O(MN) to O(M+N). However, the constant factor in O(M+N) grows exponentially with increasing dimensionality d, which makes the algorithm impractical for dimensions greater than three. In this paper we present a new algorithm where the constant factor is reduced to asymptotically polynomial order. The reduction is based on a new multivariate Taylor series expansion scheme (which can act as both a local and a far field expansion) combined with efficient space subdivision using the k-center algorithm. The proposed method differs from the original fast Gauss transform in its factorization, its efficient space subdivision, and its use of point-wise error bounds. Algorithm details, error bounds, a procedure to choose the parameters, and numerical experiments are presented. As an example, we show how the proposed method can be used for very fast ϵ-exact multivariate kernel density estimation.
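
    The abstract above names the k-center algorithm as the space subdivision step but does not say which variant is used. As an illustration only, the sketch below implements the common greedy farthest-point heuristic for k-center clustering; treating this as the variant in question is an assumption, and the name `greedy_k_center` is hypothetical.

```python
import math


def greedy_k_center(points, k):
    """Greedy farthest-point heuristic for k-center clustering.

    points: list of equal-length coordinate tuples
    k:      number of clusters, with 1 <= k <= len(points)

    Returns (center_indices, labels, radius), where labels[i] is the
    position in center_indices of the center nearest to points[i] and
    radius is the largest point-to-center distance.
    """
    if not 1 <= k <= len(points):
        raise ValueError("k must be between 1 and the number of points")

    centers = [0]  # arbitrary choice: the first point seeds the clustering
    dist = [math.dist(p, points[0]) for p in points]
    labels = [0] * len(points)

    while len(centers) < k:
        # The next center is the point farthest from all current centers.
        far = max(range(len(points)), key=dist.__getitem__)
        centers.append(far)
        new_label = len(centers) - 1
        # Reassign every point that is closer to the new center.
        for i, p in enumerate(points):
            d = math.dist(p, points[far])
            if d < dist[i]:
                dist[i] = d
                labels[i] = new_label

    return centers, labels, max(dist)


# Illustrative call on made-up 2D data with two well-separated groups.
centers, labels, radius = greedy_k_center(
    [(0.0, 0.0), (0.1, 0.0), (10.0, 0.0), (10.1, 0.0)], k=2
)
```

    Each round costs one pass over the points, so this sketch runs in O(Nk) time for N points. A small cluster radius matters in this setting because a series expansion about a cluster center only has to be accurate for the points assigned to that cluster.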

    Fast Computation of Kernel Estimators

    The computational complexity of evaluating the kernel density estimate (or its derivatives) at m evaluation points given n sample points scales quadratically as O(nm), making it prohibitively expensive for large datasets. While approximate methods like binning can speed up the computation, they lack precise control over the accuracy of the approximation: there is no straightforward way of choosing the binning parameters a priori to achieve a desired approximation error. We propose a novel, computationally efficient ε-exact approximation algorithm for univariate Gaussian kernel-based density derivative estimation that reduces the computational complexity from O(nm) to linear O(n+m). The user can specify a desired accuracy ε, and the algorithm guarantees that the actual error between the approximation and the original kernel estimate will always be less than ε. We also apply our proposed fast algorithm to speed up automatic bandwidth selection procedures. We compare our method to the best available binning methods in terms of speed and accuracy. Our experimental results show that the proposed method is almost twice as fast as the best binning methods and is around five orders of magnitude more accurate. The software for the proposed method is available online.
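
    The binning idea mentioned in the abstract above can be sketched as follows. This is a deliberately simple version, assuming equal-width bins with every sample snapped to its bin center; it is not one of the binning methods the paper benchmarks against, and the names `gauss_kde` and `binned_kde` are hypothetical. It shows why the error is hard to control a priori: the only tuning knob is the bin count, and the resulting error has to be measured after the fact.

```python
import math


def gauss_kde(samples, targets, h):
    """Exact Gaussian KDE by direct summation, O(nm)."""
    c = 1.0 / (len(samples) * h * math.sqrt(2.0 * math.pi))
    return [
        c * sum(math.exp(-0.5 * ((y - x) / h) ** 2) for x in samples)
        for y in targets
    ]


def binned_kde(samples, targets, h, n_bins):
    """Approximate Gaussian KDE using simple equal-width binning.

    Each sample is replaced by the center of its bin, so the kernel sum
    runs over at most n_bins weighted centers instead of n samples.
    """
    if n_bins < 1:
        raise ValueError("n_bins must be at least 1")
    lo, hi = min(samples), max(samples)
    width = (hi - lo) / n_bins or 1.0  # fall back if all samples coincide
    counts = [0] * n_bins
    for x in samples:
        b = min(int((x - lo) / width), n_bins - 1)  # clamp the right edge
        counts[b] += 1
    centers = [lo + (b + 0.5) * width for b in range(n_bins)]
    c = 1.0 / (len(samples) * h * math.sqrt(2.0 * math.pi))
    return [
        c * sum(
            w * math.exp(-0.5 * ((y - g) / h) ** 2)
            for g, w in zip(centers, counts)
            if w
        )
        for y in targets
    ]


# Illustrative comparison on made-up data: the error is only known after
# computing the exact answer, which defeats the purpose of the shortcut.
data = [i / 100 for i in range(101)]
exact = gauss_kde(data, [0.25, 0.5], h=0.2)
approx = binned_kde(data, [0.25, 0.5], h=0.2, n_bins=50)
max_abs_error = max(abs(a - b) for a, b in zip(exact, approx))
```

    In contrast, the ε-exact method described in the abstract takes the target accuracy as an input rather than leaving it as an outcome of the bin count.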